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Abstract 


This document proposes guidelines for the design of Autonomic Service Agents for autonomic 
networks. Autonomic Service Agents, together with the Autonomic Network Infrastructure, the 
Autonomic Control Plane, and the GeneRic Autonomic Signaling Protocol, constitute base 
elements of an autonomic networking ecosystem. 
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publication by the Internet Engineering Steering Group (IESG). Not all documents approved by 
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Information about the current status of this document, any errata, and howto provide feedback 
on it may be obtained at https://www.rfc-editor.org/info/rfc9222. 
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1. Introduction 


This document proposes guidelines for the design of Autonomic Service Agents (ASAs) in the 
context of an Autonomic Network (AN) based on the Autonomic Network Infrastructure (ANI) 
outlined in the autonomic networking reference model [RFC8993]. This infrastructure makes use 
of the Autonomic Control Plane (ACP) [RFC8994] and the GeneRic Autonomic Signaling Protocol 
(GRASP) [RFC8990]. A general introduction to this environment may be found at [IPJ], which also 
includes explanatory diagrams, anda summary of terminology is in Section 2. 


This document is a contribution to the description of an autonomic networking ecosystem, 
recognizing that a deployable autonomic network needs more than just ACP and GRASP 
implementations. Such an autonomic network must achieve management tasks that a Network 
Operations Center (NOC) cannot readily achieve manually, such as continuous resource 
optimization or automated fault detection and repair. These tasks, and other management 
automation goals, are described at length in [RFC7575]. The net result should be significant 
operational improvement. To achieve this, the autonomic networking ecosystem must include at 
least a library of ASAs and corresponding GRASP technical objective definitions. AGRASP 
objective [RFC8990] is a data structure whose main contents are a name and a value. The value 
consists of a single configurable parameter or a set of parameters of some kind. 


There must also be tools to deploy and oversee ASAs, and integration with existing operational 
mechanisms [RFC8368]. However, this document focuses on the design of ASAs, with some 
reference to implementation and operational aspects. 


There is considerable literature about autonomic agents with a variety of proposals about how 
they should be characterized. Some examples are [DEMOLAO6], [HUEBSCHERO8], [MOVAHEDI12], 
and [GANA13]. However, for the present document, the basic definitions and goals for autonomic 
networking given in [RFC7575] apply. According to RFC 7575, an Autonomic Service Agent is "An 
agent implemented on an autonomic node that implements an autonomic function, either in 
part (in the case of a distributed function) or whole." 


ASAs must be distinguished from other forms of software components. They are components of 
network or service management; they do not in themselves provide services to end users. They 
do, however, provide management services to network operators and administrators. For 
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example, the services envisaged for network function virtualization (NFV) [NFV] or for service 
function chaining (SFC) [RFC7665] might be managed by an ASA rather than by traditional 
configuration tools. 


Another example is that an existing script running within a router to locally monitor or configure 
functions or services could be upgraded to an ASA that could communicate with peer scripts on 
neighboring or remote routers. A high-level API will allow such upgraded scripts to take full 
advantage of the secure ACP and the discovery, negotiation, and synchronization features of 
GRASP. Familiar tasks such as configuring an Interior Gateway Protocol (IGP) on neighboring 
routers or even exchanging IGP security keys could be performed securely in this way. This 
document mainly addresses issues affecting quite complex ASAs, but initially, the most useful 
ASAs may in fact be rather simple evolutions of existing scripts. 


The reference model [RFC8993] for autonomic networks explains further the functionality of 
ASAs by adding the following: 


[An ASA is] a process that makes use of the features provided by the ANI to achieve its 
own goals, usually including interaction with other ASAs via GRASP [RFC8990] or 
otherwise. Of course, it also interacts with the specific targets of its function, using any 
suitable mechanism. Unless its function is very simple, the ASA will need to handle 
overlapping asynchronous operations. It may therefore be a quite complex piece of 
software in its own right, forming part of the application layer above the ANI. 


As mentioned, there will certainly be simple ASAs that manage a single objective in a 
straightforward way and do not need asynchronous operations. In nodes where computing 
power and memory space are limited, ASAs should run at a much lower frequency than the 
primary workload, so CPU load should not be a big issue, but memory footprint in a constrained 
node is certainly a concern. ASAs installed in constrained devices will have limited functionality. 
In such cases, many aspects of the current document do not apply. However, in the general case, 
an ASA may be a relatively complex software component that will in many cases control and 
monitor simpler entities in the same or remote host(s). For example, a device controller that 
manages tens or hundreds of simple devices might contain a single ASA. 


The remainder of this document offers guidance on the design of complex ASAs. Some of the 
material may be familiar to those experienced in distributed fault-tolerant and real-time control 
systems. Robustness and security are of particular importance in autonomic networks and are 
discussed in Sections 9 and 10. 


2. Terminology 
This section summarizes various acronyms and terminology used in the document. Where no 


other reference is given, please consult [RFC8993] or [RFC7575]. 


Autonomic: 
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self-managing (self-configuring, self-protecting, self- healing, self-optimizing), but allowing 
high-level guidance by a central entity such as a NOC 


Autonomic Function: a function that adapts on its own to a changing environment 
Autonomic Node: anode that employs autonomic functions 
ACP: Autonomic Control Plane [RFC8994] 


AN: Autonomic Network; a network of autonomic nodes, which interact directly with each 
other 


ANI: Autonomic Network Infrastructure 


ASA: Autonomic Service Agent; an agent installed on an autonomic node that implements an 
autonomic function, either partially (in the case of a distributed function) or completely 


BRSKI: Bootstrapping Remote Secure Key Infrastructure [RFC8995] 
CBOR: Concise Binary Object Representation[RFC8949] 

GRASP: GeneRric Autonomic Signaling Protocol [RFC8990] 

GRASP API: GRASP Application Programming Interface [RFC8991] 
NOC: Network Operations Center [RFC8368] 


Objective: A GRASP technical objective is a data structure whose main contents are a name and 
a value. The value consists of a single configurable parameter or a set of parameters of some 
kind [RFC8990]. 


3. Logical Structure of an Autonomic Service Agent 


As mentioned above, all but the simplest ASAs will need to support asynchronous operations. 
Different programming environments support asynchronicity in different ways. In this 
document, we use an explicit multi-threading model to describe operations. This is illustrative, 
and alternatives to multi-threading are discussed in detail in connection with the GRASP API (see 
Section 4.3). 


A typical ASA will have a main thread that performs various initial housekeeping actions such as: 


e obtain authorization credentials, if needed 

e register the ASA with GRASP 

e acquire relevant policy parameters 

e declare data structures for relevant GRASP objectives 

e register with GRASP those objectives that it will actively manage 
e launch a self-monitoring thread 

e enter its main loop 
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The logic of the main loop will depend on the details of the autonomic function concerned. 
Whenever asynchronous operations are required, extra threads may be launched. Examples of 
such threads include: 


e repeatedly flood an objective to the AN so that any ASA can receive the objective's latest 
value 


e accept incoming synchronization requests for an objective managed by this ASA 


e accept incoming negotiation requests for an objective managed by this ASA, and then 
conduct the resulting negotiation with the counterpart ASA 


e manage subsidiary non-autonomic devices directly 


These threads should all either exit after their job is done or enter a wait state for new work to 
avoid wasting system resources. 


According to the degree of parallelism needed by the application, some of these threads might be 
launched in multiple instances. In particular, if negotiation sessions with other ASAs are expected 
to be long or to involve wait states, the ASA designer might allow for multiple simultaneous 
negotiating threads, with appropriate use of queues and synchronization primitives to maintain 
consistency. 


The main loop itself could act as the initiator of synchronization requests or negotiation requests 
when the ASA needs data or resources from other ASAs. In particular, the main loop should watch 
for changes in policy parameters that affect its operation and, if appropriate, occasionally 
refresh authorization credentials. It should also do whatever is required to avoid unnecessary 
resource consumption, for example, by limiting its frequency of execution. 


The self-monitoring thread is of considerable importance. Failure of autonomic service agents is 
highly undesirable. To a large extent, this depends on careful coding and testing, with no 
unhandled error returns or exceptions, but if there is nevertheless some sort of failure, the self- 
monitoring thread should detect it, fix it if possible, and, in the worst case, restart the entire ASA. 


Appendix A presents some example logic flows in informal pseudocode. 


4. Interaction with the Autonomic Networking Infrastructure 


4.1. Interaction with the Security Mechanisms 


An ASA by definition runs in an autonomic node. Before any normal ASAs are started, such nodes 
must be bootstrapped into the autonomic network's secure key infrastructure, typically in 
accordance with [RFC8995]. This key infrastructure will be used to secure the ACP (next section) 
and may be used by ASAs to set up additional secure interactions with their peers, if needed. 


Note that the secure bootstrap process itself incorporates simple special-purpose ASAs that use a 
restricted mode of GRASP (Section 4 of [RFC8995]). 


Carpenter, et al. Informational Page 6 


RFC 9222 ASA Guidelines March 2022 


4.2. Interaction with the Autonomic Control Plane 


In anormal autonomic network, ASAs will run as clients of the ACP, which will provide a fully 
secured network environment for all communication with other ASAs, in most cases mediated by 
GRASP (next section). 


Note that the ACP formation process itself incorporates simple special-purpose ASAs that use a 
restricted mode of GRASP (Section 6.4 of [RFC8994]). 


4.3. Interaction with GRASP and its API 


In a node where a significant number of ASAs are installed, GRASP [RFC8990] is likely to run asa 
separate process with its API [RFC8991] available in user space. Thus, ASAs may operate without 
special privilege, unless they need it for other reasons. The ASA's view of GRASP is built around 
GRASP objectives (Section 6), defined as data structures containing administrative information 
such as the objective's unique name, and its current value. The format and size of the value is not 
restricted by the protocol, except that it must be possible to serialize it for transmission in Concise 
Binary Object Representation (CBOR) [RFC8949], subject only to GRASP's maximum message size 
as discussed in Section 6. 


As discussed in Section 3, GRASP is an asynchronous protocol, and this document uses a multi- 
threading model to describe operations. In many programming environments, an "event loop" 
model is used instead, in which case each thread would be implemented as an event handler 
called in turn by the main loop. For this case, the GRASP API must provide non-blocking calls and 
possibly support callbacks. This topic is discussed in more detail in [RFC8991], and other 
asynchronicity models are also possible. Whenever necessary, the GRASP session identifier will be 
used to distinguish simultaneous operations. 


The GRASP API should offer the following features: 


e Registration functions, so that an ASA can register itself and the objectives that it manages. 

e A discovery function, by which an ASA can discover other ASAs supporting a given objective. 

e A negotiation request function, by which an ASA can start negotiation of an objective witha 
counterpart ASA. With this, there is a corresponding listening function for an ASA that wishes 
to respond to negotiation requests and a set of functions to support negotiating steps. Once a 
negotiation starts, it isa symmetric process with both sides sending successive objective 
values to each other until agreement is reached (or the negotiation fails). 

e A synchronization function, by which an ASA can request the current value of an objective 
from a counterpart ASA. With this, there is a corresponding listening function for an ASA that 
wishes to respond to synchronization requests. Unlike negotiation, synchronization is an 
asymmetric process in which the listener sends a single objective value to the requester. 

° A flood function, by which an ASA can cause the current value of an objective to be flooded 
throughout the AN so that any ASA can receive it. 


For further details and some additional housekeeping functions, see [RFC8991]. 
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The GRASP API is intended to support the various interactions expected between most ASAs, such 
as the interactions outlined in Section 3. However, if ASAs require additional communication 
between themselves, they can do so directly over the ACP to benefit from its security. One option 
is to use GRASP discovery and synchronization as a rendezvous mechanism between two ASAs, 
passing communication parameters such as a TCP port number via GRASP. The use of TLS over 
the ACP for such communications is advisable, as described in Section 6.9.2 of [RFC8994]. 


4.4. Interaction with Policy Mechanisms 


At the time of writing, the policy mechanisms for the ANI are undefined. In particular, the use of 
declarative policies (aka Intents) for the definition and management of an ASA's behaviors 
remains a research topic [IBN-CONCEPTS]. 


In the cases where ASAs are defined as closed control loops, the specifications defined in 
[ZSM009-1] regarding imperative and declarative goal statements may be applicable. 


In the ANI, policy dissemination is expected to operate by an information distribution 
mechanism (e.g., via GRASP [RFC8990]) that can reach all autonomic nodes and therefore every 
ASA. However, each ASA must be capable of operating "out of the box" in the absence of locally 
defined policy, so every ASA implementation must include carefully chosen default values and 
settings for all policy parameters. 


5. Interaction with Non-autonomic Components and Systems 


To have any external effects, an ASA must also interact with non-autonomic components of the 
node where it is installed. For example, an ASA whose purpose is to manage a resource must 
interact with that resource. An ASA managing an entity that is also managed by local software 
must interact with that software. For example, if such management is performed by NETCONF 
[RFC6241], the ASA must interact with the NETCONF server as an independent NETCONF client in 
the same node to avoid any inconsistency between configuration changes delivered via 
NETCONF and configuration changes made by the ASA. 


In an environment where systems are virtualized and specialized using techniques such as 
network function virtualization or network slicing, there will be a design choice whether ASAs are 
deployed once per physical node or once per virtual context. A related issue is whether the ANI as 
a whole is deployed once on a physical network or whether several virtual ANIs are deployed. This 
aspect needs to be considered by the ASA designer. 


6. Design of GRASP Objectives 


The design of an ASA will often require the design of a new GRASP objective. The general rules for 
the format of GRASP objectives, their names, and IANA registration are given in [RFC8990]. 
Additionally, that document discusses various general considerations for the design of objectives, 
which are not repeated here. However, note that GRASP, like HTTP, does not provide transactional 
integrity. In particular, steps in a GRASP negotiation are not idempotent. The design of a GRASP 
objective and the logic flow of the ASA should take this into account. One approach, which should 
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be used when possible, is to design objectives with idempotent semantics. If this is not possible, 
typically if an ASA is allocating part of a shared resource to other ASAs, it needs to ensure that the 
same part of the resource is not allocated twice. The easiest way is to run only one negotiation at 
a time. If an ASA is capable of overlapping several negotiations, it must avoid interference 
between these negotiations. 


Negotiations will always end, normally because one end or the other declares success or failure. If 
this does not happen, either a timeout or exhaustion of the loop count will occur. The definition of 
a GRASP objective should describe a specific negotiation policy if it is not self-evident. 


GRASP allows a "dry run" mode of negotiation, where a negotiation session follows its normal 
course but is not committed at either end until a subsequent live negotiation session. If dry run 
mode is defined for the objective, its specification, and every implementation, must consider 
what state needs to be saved following a dry run negotiation, such that a subsequent live 
negotiation can be expected to succeed. It must be clear how long this state is kept and what 
happens if the live negotiation occurs after this state is deleted. An ASA that requests a dry run 
negotiation must take account of the possibility that a successful dry run is followed by a failed 
live negotiation. Because of these complexities, the dry run mechanism should only be supported 
by objectives and ASAs where there is a significant benefit from it. 


The actual value field of an objective is limited by the GRASP protocol definition to any data 
structure that can be expressed in Concise Binary Object Representation (CBOR) [RFC8949]. For 
some objectives, a single data item will suffice, for example, an integer, a floating point number, a 
UTF-8 string, or an arbitrary byte string. For more complex cases, a simple tuple structure such as 
[item1, item 2, item3] could be used. Since CBOR is closely linked to JSON, it is also rather easy to 
define an objective whose value is a JSON structure. The formats acceptable by the GRASP API will 
limit the options in practice. A generic solution is for the API to accept and deliver the value field 
in raw CBOR, with the ASA itself encoding and decoding it via a CBOR library (Section 2.3.2.4 of 
[RFC8991]). 


The maximum size of the value field of an objective is limited by the GRASP maximum message 
size. If the default maximum size specified as GRASP_DEF_MAX_SIZE by [RFC8990] is not enough, 
the specification of the objective must indicate the required maximum message size for both 
unicast and multicast messages. 


A mapping from YANG to CBOR is defined by [CBOR-YANG]. Subject to the size limit defined for 
GRASP messages, nothing prevents objectives transporting YANG in this way. 


The flexibility of CBOR implies that the value field of many objectives can be extended in service, 
to add additional information or alternative content, especially if JSON-like structures are used. 
This has consequences for the robustness of ASAs, as discussed in Section 9. 


7. Life Cycle 


The ASA life cycle is discussed in [AUTONOMIC-FUNCTION], from which the following text was 
derived. It does not cover all details, and some of the terms used would require precise definitions 
in a given implementation. 
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In simple cases, autonomic functions could be permanent, in the sense that ASAs are shipped as 
part of a product and persist throughout the product's life. However, in complex cases, a more 
likely situation is that ASAs need to be installed or updated dynamically because of new 
requirements or bugs. This section describes one approach to the resulting life cycle of individual 
ASAs. It does not consider wider issues such as updates of shared libraries. 


Because continuity of service is fundamental to autonomic networking, the process of seamlessly 
replacing a running instance of an ASA witha new version needs to be part of the ASA's design. 
The implication of service continuity on the design of ASAs can be illustrated along the three 
main phases of the ASA life cycle, namely installation, instantiation, and operation. 


+-------------- + 
Undeployed ------ >| |-=----- > Undeployed 
| Installed | 
+-->| oooh 

Mandate | POSSI SSR SS BSS + | Receives a 
is revoked | BP eee + | Mandate 
EESE] |<--+ 
| Instantiated | 
+-->| Jaap 
set | ap Sa a AP | set 
down | Bal ana + | up 
+--- |<--+ 
| Operational | 
| | 
+-------------- + 


Figure 1: Life Cycle of an Autonomic Service Agent 


7.1. Installation Phase 


We define "installation" to mean that a piece of software is loaded into a device, along with any 
necessary libraries, but is not yet activated. 


Before being able to instantiate and run ASAs, the operator will first provision the infrastructure 
with the sets of ASA software corresponding to its needs and objectives. Such software must be 
checked for integrity and authenticity before installation. The provisioning of the infrastructure 
is realized in the installation phase and consists of installing (or checking the availability of) the 
pieces of software of the different ASAs in a set of Installation Hosts within the autonomic 
network. 


There are three properties applicable to the installation of ASAs: 


e The dynamic installation property allows installing an ASA on demand, on any hosts 
compatible with the ASA. 


° The decoupling property allows an ASA on one machine to control resources in another 
machine (known as "decoupled mode"). 


e The multiplicity property allows controlling multiple sets of resources from a single ASA. 
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These three properties are very important in the context of the installation phase as their 
variations condition how the ASA could be installed on the infrastructure. 


7.1.1. Installation Phase Inputs and Outputs 


Inputs are: 


e [ASA_type]: specifies which ASA to install. 
e [Installation_target_infrastructure]: specifies the candidate installation Hosts. 


e [ASA_placement_function]: specifies how the installation phase will meet the operator's 
needs and objectives for the provision of the infrastructure. This function is only useful in the 
decoupled mode. It can be as simple as an explicit list of hosts on which the ASAs are to be 
installed, or it could consist of operator-defined criteria and constraints. 


The main output of the installation phase is a [List_of_ASAs] installed on [List_of_hosts]. This 
output is also useful for the coordination function where it acts as a static interaction map (see 
Section 8.1). 


The condition to validate in order to pass to next phase is to ensure that [List_of_ASAs] are 
correctly installed on [List_of_hosts]. A minimum set of primitives to support the installation of 
ASAs could be the following: install (List_of_ASAs, Installation_target_infrastructure, 
ASA_placement_function) and uninstall (List_of_ASAs). 


7.2. Instantiation Phase 


We define "instantiation" as the operation of creating a single ASA instance from the 
corresponding piece of installed software. 


Once the ASAs are installed on the appropriate hosts in the network, these ASAs may start to 
operate. From the operator viewpoint, an operating ASA means the ASA manages the network 
resources as per the objectives given. At the ASA local level, operating means executing their 
control loop algorithm. 


There are two aspects to take into consideration. First, having a piece of code installed and 
available to run on a host is not the same as having an agent based on this piece of code running 
inside the host. Second, in a coupled case, determining which resources are controlled by an ASA 
is straightforward (the ASA runs on the same autonomic node as the resources it is controlling). 
In a decoupled mode, determining this is a bit more complex: a starting agent will have to either 
discover the set of resources it ought to control, or such information has to be communicated to 
the ASA. 


The instantiation phase of an ASA covers both these aspects: starting the agent code (when this 
does not start automatically) and determining which resources have to be controlled (when this 
is not straightforward). 
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7.2.1. Operator's Goal 


Through this phase, the operator wants to control its autonomic network regarding at least two 
aspects: 


1. determine the scope of autonomic functions by instructing which network resources have to 
be managed by which autonomic function (and more precisely by which release of the ASA 
software code, e.g., version number or provider). 


2. determine how the autonomic functions are organized by instantiating a set of ASAs across 
one or more autonomic nodes and instructing them accordingly about the other ASAs in the 
set as necessary. 


In this phase, the operator may also want to set goals for autonomic functions, e.g., by 
configuring GRASP objectives. 


The operator's goal can be summarized in an instruction to the autonomic ecosystem matching 
the following format, explained in detail in the next sub-section: 


[Instances_of_ASA_type] ready to control [Instantiation_target_infrastructure] with 
[Instantiation_target_parameters] 


7.2.2. Instantiation Phase Inputs and Outputs 


Inputs are: 


e [Instances_of_ASA_type]: specifies which ASAs to instantiate 


e [Instantiation_target_infrastructure]: specifies which resources are to be managed by the 
autonomic function; this can be the whole network or a subset of it like a domain, a physical 
segment, or even a specific list of resources. 


e [Instantiation_target_parameters]: specifies which GRASP objectives are to be sent to ASAs 
(e.g., an optimization target) 


Outputs are: 


e [Set_of_ASA_resources_relations]: describes which resources are managed by which ASA 
instances; this is not a formal message but a resulting configuration log for a set of ASAs. 


7.2.3. Instantiation Phase Requirements 
The instructions described in Section 7.2 could be either of the following: 


e Sent to a targeted ASA. In this case, the receiving Agent will have to manage the specified list 
of [Instantiation_target_infrastructure], with the [Instantiation_target_parameters]. 


e Broadcast to all ASAs. In this case, the ASAs would determine from the list which ASAs would 
handle which [Instantiation_target_infrastructure], with the 
[Instantiation_target_parameters]. 
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These instructions may be grouped as a specific data structure referred to as an ASA Instance 
Mandate. The specification of such an ASA Instance Mandate is beyond the scope of this 
document. 


The conclusion of this instantiation phase is a set of ASA instances ready to operate. These ASA 
instances are characterized by the resources they manage, the metrics being monitored, and the 
actions that can be executed (like modifying certain parameter values). The description of the 
ASA instance may be defined in an ASA Instance Manifest data structure. The specification of 
such an ASA Instance Manifest is beyond the scope of this document. 


The ASA Instance Manifest does not only serve informational purposes such as 
acknowledgement of successful instantiation to the operator but is also necessary for further 
autonomic operations with: 


e coordinated entities (see Section 8.1) 


e collaborative entities with purposes such as to establish knowledge exchange (some ASAs 
may produce knowledge or monitor metrics that would be useful for other ASAs) 


7.3. Operation Phase 


During the operation phase, the operator can: 


e activate/deactivate ASAs: enable/disable their autonomic loops 
e modify ASA targets: set different technical objectives 


e modify ASAs managed resources: update the Instance Mandate to specify a different set of 
resources to manage (only applicable to decoupled ASAs) 


During the operation phase, running ASAs can interact with other ASAs: 


e in order to exchange knowledge (e.g., an ASA providing traffic predictions to a load balancing 
ASA) 


e in order to collaboratively reach an objective (e.g., ASAs pertaining to the same autonomic 
function will collaborate, e.g., in the case of a load balancing function, by modifying link 
metrics according to neighboring resource loads) 


During the operation phase, running ASAs are expected to apply coordination schemes as per 
Section 8.1. 


7.4. Removal Phase 


When an ASA is removed from service and uninstalled, the above steps are reversed. It is 
important that its data, especially any security key material, is purged. 
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8. Coordination and Data Models 


8.1. Coordination between Autonomic Functions 


Some autonomic functions will be completely independent of each other. However, others are at 
risk of interfering with each other; for example, two different optimization functions might both 
attempt to modify the same underlying parameter in different ways. In a complete system, a 
method is needed for identifying ASAs that might interfere with each other and coordinating their 
actions when necessary. 


8.2. Coordination with Traditional Management Functions 


Some ASAs will have functions that overlap with existing configuration tools and network 
management mechanisms such as command-line interfaces, DHCP, DHCPv6, SNMP, NETCONF, 
and RESTCONF. This is, of course, an existing problem whenever multiple configuration tools are 
in use by the NOC. Each ASA designer will need to consider this issue and how to avoid clashes and 
inconsistencies in various deployment scenarios. Some specific considerations for interaction 
with OAM tools are given in [RFC8368]. As another example, [RFC8992] describes how autonomic 
management of IPV6 prefixes can interact with prefix delegation via DHCPV6. The description of 
a GRASP objective and of an ASA using it should include a discussion of any such interactions. 


8.3. Data Models 


Management functions often include a shared data model, quite likely to be expressed in a 
formal notation suchas YANG. This aspect should not be an afterthought in the design of an ASA. 
To the contrary, the design of the ASA and of its GRASP objectives should match the data model; as 
noted in Section 6, YANG serialized as CBOR may be used directly as the value of a GRASP 
objective. 


9. Robustness 


It is of great importance that all components of an autonomic system are highly robust. Although 
ASA designers should aim for their component to never fail, it is more important to design the 
ASA to assume that failures will happen and to gracefully recover from those failures when they 
occur. Hence, this section lists various aspects of robustness that ASA designers should consider: 


1. If despite all precautions, an ASA does encounter a fatal error, it should in any case restart 
automatically and try again. To mitigate a loop in case of persistent failure, a suitable pause 
should be inserted before such a restart. The length of the pause depends on the use case; 
randomization and exponential backoff should be considered. 


2. If a newly received or calculated value for a parameter falls out of bounds, the corresponding 
parameter should be either left unchanged or restored to a value known to be safe in all 
configurations. 
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3. If a GRASP synchronization or negotiation session fails for any reason, it may be repeated 
after a suitable pause. The length of the pause depends on the use case; randomization and 
exponential backoff should be considered. 


4. If a session fails repeatedly, the ASA should consider that its peer has failed, and it should 
cause GRASP to flush its discovery cache and repeat peer discovery. 


5. In any case, it may be prudent to repeat discovery periodically, depending on the use case. 


6. Any received GRASP message should be checked. If it is wrongly formatted, it should be 
ignored. Within a unicast session, an Invalid message (M_INVALID) may be sent. This 
function may be provided by the GRASP implementation itself. 


7. Any received GRASP objective should be checked. Basic formatting errors like invalid CBOR 
will likely be detected by GRASP itself, but the ASA is responsible for checking the precise 
syntax and semantics of a received objective. If it is wrongly formatted, it should be ignored. 
Within a negotiation session, a Negotiation End message (M_END) witha Decline option 
(O_DECLINE) should be sent. An ASA may log such events for diagnostic purposes. 


8. On the other hand, the definitions of GRASP objectives are very likely to be extended, using 
the flexibility of CBOR or JSON. Therefore, ASAs should be able to deal gracefully with 
unknown components within the values of objectives. The specification of an objective 
should describe how unknown components are to be handled (ignored, logged and ignored, 
or rejected as an error). 


9. If an ASA receives either an Invalid message (M_INVALID) or a Negotiation End message 
(M_END) with a Decline option (O_DECLINE), one possible reason is that the peer ASA does 
not support a new feature of either GRASP or the objective in question. In sucha case, the ASA 
may choose to repeat the operation concerned without using that new feature. 


10. All other possible exceptions should be handled in an orderly way. There should be no such 
thing as an unhandled exception (but see point 1 above). 


At a slightly more general level, ASAs are not services in themselves, but they automate services. 
This has a fundamental impact on howto design robust ASAs. In general, when an ASA observes a 
particular state (1) of operations of the services/resources it controls, it typically aims to improve 
this state to a better state, say (2). Ideally, the ASA is built so that it can ensure that any error 
encountered can still lead to returning to (1) instead of a state (3), whichis worse than (1). One 
example instance of this principle is "make-before-break" used in reconfiguration of routing 
protocols in manual operations. This principle of operations can accordingly be coded into the 
operation of an ASA. The GRASP dry run option mentioned in Section 6 is another tool helpful for 
this ASA design goal of "test-before-make". 


10. Security Considerations 


ASAs are intended to run in an environment that is protected by the Autonomic Control Plane 
[RFC8994], admission to which depends on an initial secure bootstrap process such as BRSKI 
[RFC8995]. Those documents describe security considerations relating to the use of and properties 
provided by the ACP and BRSKI, respectively. Such an ACP can provide keying material for mutual 
authentication between ASAs as well as confidential communication channels for messages 
between ASAs. In some deployments, a secure partition of the link layer might be used instead. 
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GRASP itself has significant security considerations [RFC8990]. However, this does not relieve 
ASAs of responsibility for security. When ASAs configure or manage network elements outside the 
ACP, potentially in a different physical node, they must interact with other non-autonomic 
software components to perform their management functions. The details are specific to each 
case, but this has an important security implication. An ASA might act as a loophole by which the 
managed entity could penetrate the security boundary of the ANI. Thus, ASAs must be designed to 
avoid loopholes such as passing on executable code or proxying unverified commands and 
should, if possible, operate in an unprivileged mode. In particular, they must use secure coding 
practices, e.g., carefully validate all incoming information and avoid unnecessary elevation of 
privilege. This will apply in particular when an ASA interacts with a management component 
such as a NETCONF server. 


A similar situation will arise if an ASA acts as a gateway between two separate autonomic 
networks, i.e., it has access to two separate ACPs. Such an ASA must also be designed to avoid 
loopholes and to validate incoming information from both sides. 


As a reminder, GRASP does not intrinsically provide transactional integrity (Section 6). 


As appropriate to their specific functions, ASAs should take account of relevant privacy 
considerations [RFC6973]. 


The initial version of the autonomic infrastructure assumes that all autonomic nodes are trusted 
by virtue of their admission to the ACP. ASAs are therefore trusted to manipulate any GRASP 
objective simply because they are installed on a node that has successfully joined the ACP In the 
general case, a node may have multiple roles, and a role may use multiple ASAs, each using 
multiple GRASP objectives. Additional mechanisms for the fine-grained authorization of nodes 
and ASAs to manipulate specific GRASP objectives could be designed. Meanwhile, we repeat that 
ASAs should run without special privilege if possible. Independently of this, interfaces between 
ASAs and the router configuration and monitoring services of the node can be subject to 
authentication that provides more fine-grained authorization for specific services. These 
additional authentication parameters could be passed to an ASA during its instantiation phase. 


11. IANA Considerations 


This document has no IANA actions. 
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Appendix A. Example Logic Flows 


This appendix describes generic logic flows that combine to act as an Autonomic Service Agent 
(ASA) for resource management. Note that these are illustrative examples and are in no sense 
requirements. As long as the rules of GRASP are followed, a real implementation could be 
different. The reader is assumed to be familiar with GRASP [RFC8990] and its conceptual API 
[RFC8991]. 


Acomplete autonomic function for a distributed resource will consist of a number of instances of 
the ASA placed at relevant points in a network. Specific details will, of course, depend on the 
resource concerned. One example is IP address prefix management, as specified in [RFC8992]. In 
this case, an instance of the ASA will exist in each delegating router. 


An underlying assumption is that there is an initial source of the resource in question, referred to 
here as an origin ASA. The other ASAs, known as delegators, obtain supplies of the resource from 
the origin, delegate quantities of the resource to consumers that request it, and recover it when no 
longer needed. 


Another assumption is there is a set of network-wide policy parameters, which the origin will 
provide to the delegators. These parameters will control how the delegators decide how much 
resource to provide to consumers. Thus, the ASA logic has two operating modes: origin and 
delegator. When running as an origin, it starts by obtaining a quantity of the resource from the 
NOC, and it acts as a source of policy parameters, via both GRASP flooding and GRASP 
synchronization. (In some scenarios, flooding or synchronization alone might be sufficient, but 
this example includes both.) 


When running as a delegator, it starts with an empty resource pool, acquires the policy 
parameters by GRASP synchronization, and delegates quantities of the resource to consumers 
that request it. Both as an origin and as a delegator, when its pool is low, it seeks quantities of the 
resource by requesting GRASP negotiation with peer ASAs. When its pool is sufficient, it hands out 
resource to peer ASAs in response to negotiation requests. Thus, over time, the initial resource 
pool held by the origin will be shared among all the delegators according to demand. 


In theory, a network could include any number of origins and any number of delegators, with the 
only condition being that each origin's initial resource pool is unique. A realistic scenario is to 
have exactly one origin and as many delegators as you like. A scenario with no origin is useless. 


An implementation requirement is that resource pools are kept in stable storage. Otherwise, if a 
delegator exits for any reason, all the resources it has obtained or delegated are lost. If an origin 
exits, its entire spare pool is lost. The logic for using stable storage and for crash recovery is not 
included in the pseudocode below, which focuses on communication between ASAs. Since GRASP 
operations are not intrinsically idempotent, data integrity during failure scenarios is the 
responsibility of the ASA designer. This is a complex topic in its own right that is not discussed in 
the present document. 
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The description below does not implement GRASP's dry run function. That would require 
temporarily marking any resource handed out in a dry run negotiation as reserved, until either 
the peer obtains it in a live run, or a suitable timeout occurs. 


The main data structures used in eachinstance of the ASA are: 


e resource_pool: an ordered list of available resources, for example. Depending on the nature 
of the resource, units of resource are split when appropriate, and a background garbage 
collector recombines split resources if they are returned to the pool. 


e delegated_list: where a delegator stores the resources it has given to subsidiary devices. 


Possible main logic flows are below, using a threaded implementation model. As noted above, 
alternative approaches to asynchronous operations are possible. The transformation to an event 
loop model should be apparent; each thread would correspond to one event in the event loop. 


The GRASP objectives are as follows: 


e ["EX1.Resource", flags, loop_count, value], where the value depends on the resource 
concerned but will typically include its size and identification. 


e ["EX1.Params", flags, loop_count, value], where the value will be, for example, a JSON object 
defining the applicable parameters. 


In the outline logic flows below, these objectives are represented simply by their names. 
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MAIN PROGRAM: 


Create empty resource_pool (and an associated lock) 


Create e 
Determin 


mpty delegated_list 
e whether to act as origin 


if origin: 
Obtain initial resource_pool contents from NOC 


Obta 
Register 


Register GRASP objectives EX1.Resource and EX1.Params 


in value of EX1.Params from NOC 
ASA with GRASP 


if origin: 


Star 


Start SYNCHRONIZER listener for EX1.Params 


t FLOODER thread to flood EX1.Params 


Start MAIN_NEGOTIATOR thread for EX1.Resource 


if not o 


Obtain value of EX1.Params from GRASP flood or synchronization 


Star 


rigin: 


t DELEGATOR thread 


Start GARBAGE_COLLECTOR thread 


good_pee 


PESTRNONE 


do forever: 


1RR Ir 


esource_pool is low: 
Calculate amount A of resource needed 


Discover peers using GRASP M_DISCOVER / M_RESPONSE 


if good_peer in peers: 
peer = good_peer 
else: 
peer = #any choice among peers 


grasp.request_negotiate("EX1.Resource", 


#i.e., send negotiation request 


Wait for response (M_NEGOTIATE, M_END or M_WAIT) 


IF OK: 


peer) 


if offered amount of resource sufficient: 


Send M_END + O_ACCEPT #negotiation succeeded 


Add resource to pool 


good_peer = peer #remember this choice 


else: 


Send M_END + O_DECLINE #negotiation failed 
good_peer = none #forget this choice 
sleep() #periodic timer suitable for application scenario 


MAIN_NEGOTIATOR thread: 


do forever: 


gras 
#i.e 


Start a separate new NEGOTIATOR thread for requested amount A 
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NEGOTIATOR thread: 


Request resource amount A from resource_pool 


if not OK: 
while not OK and A > Amin: 
A = A-1 


Request resource amount A from resource_pool 
if OK: 
Offer resource amount A to peer by GRASP M_NEGOTIATE 
if received M_END + O_ACCEPT: 
#negotiation succeeded 
elif received M_END + O_DECLINE or other error: 
#negotiation failed 
Return resource to resource_pool 
else: 
Send M_END + O_DECLINE #negotiation failed 
#thread exits 


DELEGATOR thread: 


do forever: 
Wait for request or release for resource amount A 
if request: 
Get resource amount A from resource_pool 


HUL 
Delegate resource to consumer #atomic 
Record in delegated_list #operation 
else: 


Signal failure to consumer 
Signal main thread that resource_pool is low 
else: 
Delete resource from delegated_list 
Return resource amount A to resource_pool 


SYNCHRONIZER thread: 
do forever: 


Wait for M_REQ_SYN message for EX1.Params 
Reply with M_SYNCH message for EX1.Params 


FLOODER thread: 
do forever: 


Send M_FLOOD message for EX1.Params 
sleep() #periodic timer suitable for application scenario 
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GARBAGE_COLLECTOR thread: 


do forever: 
Search resource_pool for adjacent resources 
Merge adjacent resources 
sleep() #periodic timer suitable for application scenario 
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